Exploring Performance-Correctness Explicitly-Decoupled Architectures
Abstract
Optimizing for the common case has been an adage of processor design for decades. However, as system complexity and the sophistication of optimization techniques have increased substantially, maintaining correctness under all situations, however unlikely, has demanded extra conservatism in all layers of system design. Mounting concerns over process, voltage, and temperature variation add further conservatism to the setting of operating parameters. Excessive conservatism, in turn, hurts performance and efficiency in the common case. Yet much of a system’s complexity comes from advanced performance features, and occasional errors in imperfect components of those features need not compromise the functionality and correctness of the whole system. In this thesis, we propose to separate performance goals from the correctness goal using an explicitly-decoupled architecture. As a proof of concept, we discuss two incarnations of this architecture for an out-of-order microprocessor. First, we discuss how an explicitly-decoupled architecture can be used to implement an efficient mechanism to track and enforce memory dependences. Then, we discuss enhancements that improve the exploitation of traditional instruction-level parallelism (ILP). In both designs, a decoupled performance-enhancement engine performs optimistic execution and helps an independent correctness engine by passing it high-quality predictions. Freeing the performance domain from correctness concerns allows us to optimize its execution more effectively than is possible in a monolithic design with correctness requirements. We show that such a decoupled design allows significant optimization benefits and is much less sensitive to conservatism applied in the correctness domain.
Similar resources
A Performance-Correctness Explicitly-Decoupled Architecture: Technical Report
Optimizing the common case has been an adage in decades of processor design practices. However, as the system complexity and optimization techniques’ sophistication have increased substantially, maintaining correctness under all situations, however unlikely, is contributing to the necessity of extra conservatism in all layers of the system design. The mounting process, voltage, and temperature ...
Scheduled Dataflow: Execution Paradigm, Architecture, and Performance Evaluation
In this paper, the Scheduled Dataflow (SDF) architecture, a decoupled memory/execution, multithreaded architecture using nonblocking threads, is presented in detail and evaluated against Superscalar architecture. Recent focus in the field of new processor architectures is mainly on VLIW (e.g., IA-64), superscalar, and superspeculative designs. This trend allows for better performance, but at the...
Decoupled Architectures for Complexity-Effective General Purpose Processors
Decoupled architectures have previously been investigated in the context of high performance scientific computing. For general purpose computing, however, superscalar processors have proven to be flexible in providing high performance across a wide range of applications. To achieve this goal, these architectures have incorporated enormous amounts of complexity to obtain modest performance impro...
Design of magnetic dipole based 3D integration nano-circuits for future electronics application
Nano Magnetic Logic (NML) has been attracting application in optical computing, nanodevice formation, and low power. In this paper nanoscale architecture such as the decoder, multiplexer, and comparator are implemented on perpendicular-nano magnetic logic (pNML) technology. All these architectures with the superiority of minimum complexity and minimum delay are pointed. The proposed architectur...
Publication date: 2011